Advanced ggplot

Andy Grogan-Kaylor

2021-10-27

Simulated Data From Social Service Agency

These are simulated data designed to be similar to the data that might come from a social service agency.

The data contain the following (simulated) variables: “ID”, “age”, “gender”, “race_ethnicity”, “family_income”, “program”, “mental_health_T1”, “mental_health_T2”, “latitude”, “longitude”.

The mental health variables are scaled to have an average of 100. Lower scores indicate poorer mental health; higher scores indicate better mental health.

Mental health status differs across clients in the data. An interesting exercise is to use software such as Excel, Google Sheets, Tableau, or R to see which factors predict these differences.
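As one possible starting point for that exercise in R, the sketch below compares group means and fits a simple linear model. It runs on a small simulated stand-in called `clients_demo`. The program labels, gender labels, and the standard deviation of 10 are invented for illustration; with the real data, `clients` would take its place.

```r
# Hypothetical stand-in for `clients`: labels and values are invented for illustration
set.seed(1)
clients_demo <- data.frame(
  program = sample(c("Program A", "Program B", "Program C"), 300, replace = TRUE),
  gender  = sample(c("Female", "Male", "Non-binary"), 300, replace = TRUE),
  mental_health_T2 = rnorm(300, mean = 100, sd = 10) # sd = 10 is an assumption
)

# Average Time 2 mental health score within each program
program_means <- aggregate(mental_health_T2 ~ program,
                           data = clients_demo,
                           FUN = mean)
print(program_means)

# A linear model with program and gender as candidate predictors
fit <- lm(mental_health_T2 ~ program + gender, data = clients_demo)
print(coef(fit))
```

Because the stand-in is pure noise, no predictor should stand out here; the point is only the shape of the workflow.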

The Data

load("clients.RData")

Review of Basic ggplot

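  1. data: the data frame to plot (here, clients)
  2. aesthetic: which variables map to x, y, color, and so on
  3. geometry: the shape used to draw the data (bars, points, boxes)

library(ggplot2) # load the ggplot2 package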
ggplot(clients,
       aes(x = program,
           y = mental_health_T2)) +
  stat_summary(fun = mean, # summarizing y 
               geom = "bar") # with bars
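To make explicit what `stat_summary()` does in the plot above, the following sketch computes the program means by hand and draws them with `geom_col()`. It uses a simulated stand-in, `clients_demo`, with invented program labels; that object and its values are assumptions, not the workshop data.

```r
library(ggplot2)

# Hypothetical stand-in for `clients`: labels and values are invented for illustration
set.seed(1)
clients_demo <- data.frame(
  program = sample(c("Program A", "Program B", "Program C"), 300, replace = TRUE),
  mental_health_T2 = rnorm(300, mean = 100, sd = 10)
)

# Option 1: let ggplot2 summarize y within each x value
p_summary <- ggplot(clients_demo,
                    aes(x = program,
                        y = mental_health_T2)) +
  stat_summary(fun = mean,   # summarizing y
               geom = "bar") # with bars

# Option 2: summarize first, then plot one bar per precomputed mean
program_means <- aggregate(mental_health_T2 ~ program,
                           data = clients_demo,
                           FUN = mean)

p_col <- ggplot(program_means,
                aes(x = program,
                    y = mental_health_T2)) +
  geom_col()
```

Both plots should show the same bar heights. The first keeps the raw data attached to the plot, which makes it easy to swap in another geometry later.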

Alternate Geometries

Points

ggplot(clients,
       aes(x = program,
           y = mental_health_T2)) +
  geom_point()

Jittered Points

ggplot(clients,
       aes(x = program,
           y = mental_health_T2)) +
  geom_jitter()
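Jittering adds random noise, so the plot changes a little on every run. A minimal sketch of one way to control this, assuming a simulated stand-in `clients_demo` with invented program labels: restrict the noise to the horizontal direction with `height = 0`, so scores are drawn at their true values, and fix the `seed` so the plot is reproducible.

```r
library(ggplot2)

# Hypothetical stand-in for `clients`: labels and values are invented for illustration
set.seed(1)
clients_demo <- data.frame(
  program = sample(c("Program A", "Program B", "Program C"), 300, replace = TRUE),
  mental_health_T2 = rnorm(300, mean = 100, sd = 10)
)

p_jitter <- ggplot(clients_demo,
                   aes(x = program,
                       y = mental_health_T2)) +
  geom_jitter(position = position_jitter(width = 0.2, # horizontal spread
                                         height = 0,  # leave y values untouched
                                         seed = 42),  # same noise on every run
              alpha = 0.5) # transparency helps with overlapping points
```

The width of 0.2 and alpha of 0.5 are arbitrary starting values; they usually need tuning for the number of points.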

Dotplots

# install.packages("ggdist")

library(ggdist) # distribution plots

ggplot(clients,
       aes(x = program,
           y = mental_health_T2)) +
  stat_dots() # dotplot geometry

Boxplots

ggplot(clients,
       aes(x = program,
           y = mental_health_T2)) +
  geom_boxplot() # boxplot geometry

Use of Color

Color As An Aesthetic Element

For example, U.N. Blue (#009edb).

Here, color adds visual appeal but carries no information. We do this by setting the color as a fixed argument of the geometry, outside aes().

ggplot(clients,
       aes(x = program,
           y = mental_health_T2)) +
  stat_dots(fill = "#009edb") # dotplot geometry

Color As Information

Compare the minimal and maximal philosophies.

Here, we map color inside aes() so that it carries additional information: the gender of respondents.

ggplot(clients,
       aes(x = program,
           fill = gender, # color as `information`
           y = mental_health_T2)) +
  stat_dots() # dotplot geometry
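The difference between setting a color and mapping a color can be seen side by side in the sketch below. It uses `geom_boxplot()` rather than `stat_dots()` so that only ggplot2 is needed, and it runs on a simulated stand-in, `clients_demo`, whose labels and values are invented for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for `clients`: labels and values are invented for illustration
set.seed(1)
clients_demo <- data.frame(
  program = sample(c("Program A", "Program B", "Program C"), 300, replace = TRUE),
  gender  = sample(c("Female", "Male", "Non-binary"), 300, replace = TRUE),
  mental_health_T2 = rnorm(300, mean = 100, sd = 10)
)

# Setting: one fixed fill for every box, given outside aes(); no legend
p_set <- ggplot(clients_demo,
                aes(x = program,
                    y = mental_health_T2)) +
  geom_boxplot(fill = "#009edb")

# Mapping: fill depends on a variable, given inside aes(); adds a legend
p_map <- ggplot(clients_demo,
                aes(x = program,
                    fill = gender,
                    y = mental_health_T2)) +
  geom_boxplot()
```

A common slip is to write a fixed color inside `aes()`. ggplot2 then treats the color string as a data value, picks a default palette color for it, and adds a legend entry for the literal string.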

Color Palettes

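The viridis color palette is designed to be perceptually uniform and to remain readable for viewers with color vision deficiency.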

ggplot(clients,
       aes(x = program,
           fill = gender, # color as `information`
           y = mental_health_T2)) +
  stat_dots() + # dotplot geometry
  scale_fill_viridis_d()
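Two variations on the palette choice are sketched below: a different viridis option, and a hand-picked palette with `scale_fill_manual()`. The example uses `geom_boxplot()` so that only ggplot2 is needed. The data frame `clients_demo`, its gender labels, and the two extra hex colors are assumptions made for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for `clients`: labels and values are invented for illustration
set.seed(1)
clients_demo <- data.frame(
  program = sample(c("Program A", "Program B", "Program C"), 300, replace = TRUE),
  gender  = sample(c("Female", "Male", "Non-binary"), 300, replace = TRUE),
  mental_health_T2 = rnorm(300, mean = 100, sd = 10)
)

base_plot <- ggplot(clients_demo,
                    aes(x = program,
                        fill = gender,
                        y = mental_health_T2)) +
  geom_boxplot()

# Another member of the viridis family; `end` trims the lightest colors
p_plasma <- base_plot +
  scale_fill_viridis_d(option = "plasma", end = 0.9)

# A manual palette: names must match the levels of the mapped variable
demo_colors <- c("Female" = "#009edb",
                 "Male" = "#f58220",
                 "Non-binary" = "#6c757d")

p_manual <- base_plot +
  scale_fill_manual(values = demo_colors)
```

Naming the values in the manual palette ties each color to a specific category, so the assignment does not shift if the category order changes.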

Coordinate Systems

Flipping The Coordinates

ggplot(clients,
       aes(x = program,
           fill = gender, # color as `information`
           y = mental_health_T2)) +
  stat_dots() + # dotplot geometry
  scale_fill_viridis_d() +
  coord_flip()
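As an alternative to `coord_flip()`, many ggplot2 geometries (from version 3.3.0 onward) infer their orientation from the aesthetics, so mapping the categorical variable to y can give a similar horizontal layout. The sketch below shows both routes with `geom_boxplot()` on a simulated stand-in, `clients_demo`, with invented labels; whether a given geometry from another package supports this should be checked in its documentation.

```r
library(ggplot2)

# Hypothetical stand-in for `clients`: labels and values are invented for illustration
set.seed(1)
clients_demo <- data.frame(
  program = sample(c("Program A", "Program B", "Program C"), 300, replace = TRUE),
  mental_health_T2 = rnorm(300, mean = 100, sd = 10)
)

# Route 1: build the plot vertically, then flip the coordinate system
p_flipped <- ggplot(clients_demo,
                    aes(x = program,
                        y = mental_health_T2)) +
  geom_boxplot() +
  coord_flip()

# Route 2: map the categorical variable to y directly
p_swapped <- ggplot(clients_demo,
                    aes(x = mental_health_T2,
                        y = program)) +
  geom_boxplot()
```

With the first route, `labs(x = ...)` still refers to the variable mapped to x, even though it is drawn on the vertical axis. With the second, axis labels match what is seen on screen.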

Small Multiples

ggplot(clients,
       aes(x = program,
           fill = gender, # color as `information`
           y = mental_health_T2)) +
  stat_dots() + # dotplot geometry
  scale_fill_viridis_d() +
  coord_flip() +
  facet_wrap(~neighborhood)
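The layout of the small multiples can be adjusted through the arguments of `facet_wrap()`. The sketch below sets the number of columns with `ncol`. It runs on a simulated stand-in, `clients_demo`, and the four neighborhood names are invented for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for `clients`: labels and values are invented for illustration
set.seed(1)
clients_demo <- data.frame(
  program = sample(c("Program A", "Program B", "Program C"), 400, replace = TRUE),
  neighborhood = sample(c("North", "South", "East", "West"), 400, replace = TRUE),
  mental_health_T2 = rnorm(400, mean = 100, sd = 10)
)

p_facets <- ggplot(clients_demo,
                   aes(x = program,
                       y = mental_health_T2)) +
  geom_boxplot() +
  coord_flip() +
  facet_wrap(~neighborhood, # one panel per neighborhood
             ncol = 2)      # arrange the panels in two columns
```

By default all panels share the same axis scales, which keeps comparisons across panels honest.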

Titles, Labels and Theming

ggplot(clients,
       aes(x = program,
           fill = gender, # color as `information`
           y = mental_health_T2)) +
  stat_dots() + # dotplot geometry
  scale_fill_viridis_d() +
  coord_flip() +
  labs(title = "Program Enrollment",
       subtitle = "By Gender Identity",
       caption = "Higher Mental Health Scores Are Better",
       y = "Mental Health At Time 2",
       x = "Program") +
  theme_minimal() +
  theme(plot.title = element_text(size = rel(1.5),
                                  color = "darkblue"))
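When several plots should share the same look, the theme settings can be stored once and added to each plot. The sketch below keeps them in a list named `house_style`; that name, the plot title, and the simulated stand-in `clients_demo` are assumptions made for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for `clients`: labels and values are invented for illustration
set.seed(1)
clients_demo <- data.frame(
  program = sample(c("Program A", "Program B", "Program C"), 300, replace = TRUE),
  mental_health_T2 = rnorm(300, mean = 100, sd = 10)
)

# Reusable styling: the complete theme first, then the tweaks on top of it
house_style <- list(
  theme_minimal(),
  theme(plot.title = element_text(size = rel(1.5),
                                  color = "darkblue"))
)

p_styled <- ggplot(clients_demo,
                   aes(x = program,
                       y = mental_health_T2)) +
  geom_boxplot() +
  labs(title = "Mental Health At Time 2 By Program",
       y = "Mental Health At Time 2",
       x = "Program") +
  house_style # a list of components can be added like a single layer
```

The order inside the list matters: a complete theme such as `theme_minimal()` replaces earlier `theme()` tweaks, so the tweaks come last.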